Naive Bayes and Text Classification I - Introduction and Theory

Author

  • Sebastian Raschka
Abstract

2 Naive Bayes Classification
2.1 Overview
2.2 Posterior Probabilities
2.3 Class-conditional Probabilities
2.4 Prior Probabilities
2.5 Evidence
2.6 Multinomial Naive Bayes - A Toy Example
2.6.1 Maximum-Likelihood Estimates
2.6.2 Classification
2.6.3 Additive Smoothing
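
The outline above names prior probabilities, class-conditional probabilities, maximum-likelihood estimates, and additive smoothing. As a rough illustration of how these pieces might fit together in a multinomial Naive Bayes classifier, here is a minimal sketch. It is not taken from the paper: the toy documents, the function names `train` and `predict`, and the smoothing parameter `alpha` are all assumptions made for this example.

```python
import math
from collections import Counter

def train(docs, alpha=1.0):
    """Estimate log priors and smoothed log class-conditional word probabilities.

    docs  : list of (tokens, label) pairs (hypothetical toy format)
    alpha : additive smoothing parameter (alpha=1 is Laplace smoothing)
    """
    vocab = {w for tokens, _ in docs for w in tokens}
    docs_per_class = Counter(label for _, label in docs)
    word_counts = {c: Counter() for c in docs_per_class}
    for tokens, label in docs:
        word_counts[label].update(tokens)

    # Prior: fraction of training documents that belong to each class.
    log_prior = {c: math.log(n / len(docs)) for c, n in docs_per_class.items()}

    # Class-conditional: smoothed relative frequency of each word in a class.
    log_like = {}
    for c, counts in word_counts.items():
        denom = sum(counts.values()) + alpha * len(vocab)
        log_like[c] = {w: math.log((counts[w] + alpha) / denom) for w in vocab}
    return log_prior, log_like

def predict(tokens, log_prior, log_like):
    """Return the class with the highest unnormalized log posterior.

    The evidence term is the same for every class, so it is left out.
    Words outside the training vocabulary are skipped (a simplification).
    """
    scores = {
        c: log_prior[c] + sum(log_like[c][w] for w in tokens if w in log_like[c])
        for c in log_prior
    }
    return max(scores, key=scores.get)

# Hypothetical toy corpus, invented for this sketch.
docs = [
    ("win cash now".split(), "spam"),
    ("cheap cash offer".split(), "spam"),
    ("meeting schedule today".split(), "ham"),
    ("project meeting notes".split(), "ham"),
]
log_prior, log_like = train(docs)
label = predict("cash offer today".split(), log_prior, log_like)
```

Working in log space avoids numerical underflow when many small probabilities are multiplied, and with `alpha > 0` a word that never occurred in a class still gets a nonzero probability instead of forcing the whole product to zero.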

Similar resources

A New Approach for Text Documents Classification with Invasive Weed Optimization and Naive Bayes Classifier

With the rapid growth in the number of documents, Text Document Classification (TDC) methods have become crucial. This paper presents a hybrid model of Invasive Weed Optimization (IWO) and a Naive Bayes (NB) classifier (IWO-NB) for Feature Selection (FS), in order to reduce the large feature space in TDC. TDC includes different actions such as text processing, feature extraction, form...


In silico prediction of anticancer peptides by TRAINER tool

Cancer is one of the causes of death worldwide. Several treatment methods exist against cancer cells, such as radiotherapy and chemotherapy. Since traditional methods are expensive and have side effects on normal cells, identifying and developing new methods of cancer therapy is very important. Antimicrobial peptides, present in a wide variety of organisms, such as plants, amphibians and ...


Poisson naive Bayes for text classification with feature weighting

In this paper, we investigate the use of a multivariate Poisson model and feature weighting to learn a naive Bayes text classifier. Our new naive Bayes text classification model assumes that a document is generated by a multivariate Poisson model, whereas previous works consider a document as a vector of binary term features based on the presence or absence of each term. We also explore the use of...


Bridging the Gap between Naive Bayes and Maximum Entropy Text Classification

The naive Bayes and maximum entropy approaches to text classification are typically discussed as completely unrelated techniques. In this paper, however, we show that both approaches are simply two different ways of doing parameter estimation for a common log-linear model of class posteriors. In particular, we show how to map the solution given by maximum entropy into an optimal solut...


An Improved Naive Bayes Text Classification Algorithm In Chinese Information Processing

In Chinese information processing, Naive Bayes is a simple text classification method that is easy to implement. Its core is the algorithm for calculating posterior probabilities and the effective dimensionality reduction of feature words. This paper improves Naive Bayes text classification in both respects: the calculation of posterior probabilities and the dimensionality reduction of feature words of text...



Journal:
  • CoRR

Volume: abs/1410.5329

Publication year: 2014